Search results for "Statistically Validated Networks"

showing 8 items of 8 documents

Statistically Validated Networks for assessing topic quality in LDA models

2022

Probabilistic topic models have become one of the most widespread machine learning techniques for textual analysis. In this framework, Latent Dirichlet Allocation (LDA) (Blei et al., 2003) has gained increasing popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned by the model may be characterized only by a set of irrelevant or unconnected words, making them useless for interpretation. Although many topic-quality metrics have been proposed (Newman et al., 2009; Alet…

Settore SECS-S/06 - Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.; Settore SECS-S/01 - Statistica; Topic Model; Topic Coherence; LDA; Statistically Validated Networks
researchProduct

MEASURING TOPIC COHERENCE THROUGH STATISTICALLY VALIDATED NETWORKS

2020

Topic models arise from the need to understand and explore large text document collections and to predict their underlying structure. Latent Dirichlet Allocation (LDA) (Blei et al., 2003) has quickly become one of the most popular text modelling techniques. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models give no guarantee of the interpretability of their outputs. The topics learned from texts may be characterized by a set of irrelevant or unconnected words. Therefore, topic models require validation of the coherence of estimated topics. However, the automatic evaluation …

Settore SECS-S/06 - Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.; topic model; topic coherence; LDA; statistically validated networks; Settore SECS-S/01 - Statistica
researchProduct

Statistically validated mobile communication networks: the evolution of motifs in European and Chinese data

2014

Big data open up unprecedented opportunities to investigate complex systems, including society. In particular, communication data serve as major sources for computational social science, but they have to be cleaned and filtered, as they may contain spurious information due to recording errors as well as interactions, such as commercial and marketing activities, that are not directly related to the social network. The network constructed from communication data can only be considered a proxy for the network of social relationships. Here we apply a systematic method, based on multiple hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, gen…

FOS: Computer and information sciences; Physics - Physics and Society; Big data; FOS: Physical sciences; General Physics and Astronomy; Physics and Society (physics.soc-ph); 01 natural sciences; 010305 fluids & plasmas; 0103 physical sciences; 010306 general physics; Proxy (statistics); Social and Information Networks (cs.SI); Physics; Social network; Computer Science - Social and Information Networks; Complex network; complex networks; social systems; statistically validated networks; mobile call records; 3-motifs; Settore FIS/07 - Fisica Applicata (Beni Culturali, Ambientali, Biol. e Medicin); Bonferroni correction; Mobile phone; Mobile telephony; Data mining; Raw data
researchProduct

STRANIERI, MERIDIONALI O PROVINCIALI? I CONSUMI NEL TEMPO LIBERO DELLE SECONDE GENERAZIONI [Foreigners, Southerners or Provincials? Leisure-Time Consumption among the Second Generations]

2022

In this paper, we analyze leisure-time consumption patterns among young people belonging to the so-called “second generation” of immigrants in Italy. Leisure-time consumption describes how young immigrants use cultural products and services. We analyze data collected by ISTAT through the survey on the “second generations” (2015). A comparison of leisure consumption patterns between second-generation immigrants and their Italian peers does not show significant differences. Rather, differences in consumption styles are associated with gender (male/female), geographic area of residence (North/South), and size of the municipality of residence (large/small).

Settore SPS/07 - Sociologia Generale; Young immigrants; Leisure time consumption; Social integration; Statistically validated networks
researchProduct

Households and their Expenditures as an Evolving Complex Social System

2020

Household consumption; Complex Systems; Statistically Validated Networks; Settore SPS/07 - Sociologia Generale
researchProduct

Dynamics of fintech terms in news and blogs and specialization of companies of the fintech industry

2020

We perform a large-scale analysis of a list of fintech terms in (i) news and blogs in the English language and (ii) professional descriptions of companies operating in many countries. The occurrence and co-occurrence of fintech terms and locutions show a progressive evolution of the list of fintech terms into a compact and coherent set of terms used worldwide to describe fintech business activities. By using complex network methods that are specifically designed to deal with heterogeneous systems, our analysis of a large set of professional descriptions of companies shows that companies having fintech terms in their description present over-expressions of specific attributes of country, muni…

Physics - Physics and Society; Applied Mathematics; Economic sector; Fintech; Statistically validated networks; General Physics and Astronomy; FOS: Physical sciences; Statistical and Nonlinear Physics; English language; Physics and Society (physics.soc-ph); Complex network; Business activities; 01 natural sciences; Settore FIS/07 - Fisica Applicata (Beni Culturali, Ambientali, Biol. e Medicin); 010305 fluids & plasmas; Set (abstract data type); FOS: Economics and business; Dynamics (music); 0103 physical sciences; Specialization (functional); Business; 010306 general physics; General Finance (q-fin.GN); Quantitative Finance - General Finance; Mathematical Physics; Industrial organization
researchProduct

Statistically Validated Networks for evaluating coherence in topic models

2022

Probabilistic topic models have become one of the most widespread machine learning techniques for textual analysis. In this framework, Latent Dirichlet Allocation (LDA) has gained increasing popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned by the model may be characterized by a set of irrelevant or unconnected words, making them useless for interpretation. In the framework of topic quality evaluation, the pairwise semantic cohesion among the top-N most pr…

Settore SECS-S/06 - Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.; Text Mining; Probabilistic Topic Models; Topic coherence; Statistically Validated Networks; Settore SECS-S/01 - Statistica
researchProduct

A primer on statistically validated networks

2019

In this contribution we discuss some network analysis approaches that provide information about single links or single nodes with respect to a null hypothesis that takes into account the empirically observed heterogeneity of the system. With this approach, a selection of nodes and links is feasible when the null hypothesis is statistically rejected. We focus our discussion on approaches using i) the so-called disparity filter and ii) statistically validated networks in bipartite networks. For both methods we discuss the importance of using multiple hypothesis test correction. Specific applications of statistically validated networks are discussed. We also discuss how statistically validated netw…

complex networks; statistically validated networks; Settore FIS/07 - Fisica Applicata (Beni Culturali, Ambientali, Biol. e Medicin)
researchProduct
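
As an illustration of the idea summarized in the abstract above (testing each link of a bipartite projection against a null hypothesis that respects node heterogeneity, then correcting for multiple tests), here is a minimal Python sketch. It is not taken from any of the listed papers. It assumes a hypergeometric null model for the number of shared neighbours and a Bonferroni correction with one test per node pair; the function names `hypergeom_sf` and `validated_links`, the `neighbors` input format, and the default `alpha` are hypothetical choices made for this example.

```python
from itertools import combinations
from math import comb


def hypergeom_sf(k, n_items, deg_a, deg_b):
    """P(X >= k) when deg_b items are drawn from n_items, deg_a of which are 'marked'."""
    upper = min(deg_a, deg_b)
    total = comb(n_items, deg_b)
    favourable = sum(
        comb(deg_a, x) * comb(n_items - deg_a, deg_b - x)
        for x in range(k, upper + 1)
    )
    return favourable / total


def validated_links(neighbors, n_items, alpha=0.01):
    """Return {(a, b): p_value} for pairs whose co-occurrence passes the Bonferroni test.

    neighbors: dict mapping each node of one set to the set of items it is linked to.
    n_items:   total number of items in the other set of the bipartite network.
    alpha:     family-wise significance level (assumed value, not from the source).
    """
    pairs = list(combinations(sorted(neighbors), 2))
    if not pairs:
        return {}
    threshold = alpha / len(pairs)  # Bonferroni: one test per pair of nodes
    links = {}
    for a, b in pairs:
        shared = len(neighbors[a] & neighbors[b])
        if shared == 0:
            continue  # no co-occurrence, nothing to validate
        p = hypergeom_sf(shared, n_items, len(neighbors[a]), len(neighbors[b]))
        if p < threshold:
            links[(a, b)] = p
    return links


# Toy data: A and B share 9 of their 10 items, D shares only 2 items with A and with B.
neighbors = {
    "A": set(range(10)),
    "B": set(range(9)) | {20},
    "C": set(range(30, 40)),
    "D": {0, 1} | set(range(40, 48)),
}
links = validated_links(neighbors, n_items=50)
print(sorted(links))
```

In this toy data only the pair (A, B) should survive: an overlap of 2 out of 10 items is what the null model expects on average for nodes of this degree, so the pairs involving D are not validated. A less conservative correction, such as a false discovery rate procedure, could be substituted for the Bonferroni threshold.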